Comparative Genome Analysis of the Genus Curvibacter and the Description of Curvibacter microcysteis sp. nov. and Curvibacter cyanobacteriorum sp. nov., Isolated from Fresh Water during the Cyanobacterial Bloom Period

Three Gram-negative, catalase- and oxidase-positive bacterial strains, RS43T, HBC28, and HBC61T, were isolated from fresh water and subjected to a polyphasic study. Comparison of 16S rRNA gene sequences initially indicated that strains RS43T, HBC28, and HBC61T were closely related to species of the genus Curvibacter and shared the highest sequence similarity of 98.14%, 98.21%, and 98.76%, respectively, with Curvibacter gracilis 7-1T. Phylogenetic analysis based on genome sequences placed all strains within the genus Curvibacter. The average nucleotide identity (ANI) and digital DNA–DNA hybridization (dDDH) values between the three strains and related type strains supported their recognition as two novel genospecies in the genus Curvibacter. Comparative genomic analysis revealed that the genus possesses an open pangenome. Based on KEGG BlastKOALA analyses, Curvibacter species have the potential to metabolize benzoate, phenylacetate, catechol, and salicylate, indicating their potential use in the elimination of these compounds from water systems. The results of polyphasic characterization indicated that strains RS43T and HBC61T represent two novel species, for which the names Curvibacter microcysteis sp. nov. (type strain RS43T = KCTC 92793T = LMG 32714T) and Curvibacter cyanobacteriorum sp. nov. (type strain HBC61T = KCTC 92794T = LMG 32713T) are proposed.


Isolation and Cultivation
Strains RS43T, HBC28, and HBC61T were isolated from Daechung Reservoir (36°22′33.7″N, 127°38′20.6″E) according to the procedures previously described [7]. Briefly, the sample was diluted by the standard ten-fold serial dilution method and spread on Reasoner's 2A (R2A) agar plates. After incubation at 25°C for 1 week under aerobic conditions, colonies were picked and purified twice by repeated subculture. The strains were then preserved in R2A medium containing 20% glycerol (v/v) at -80°C. Considering their close phylogenetic relationships, Curvibacter gracilis KCTC 42831T (=7-1T), Curvibacter lanceolatus KCTC 42829T (=IAM 14947T), and Curvibacter delicatus KCTC 42828T (=IAM 14955T), procured from KCTC, were chosen as reference strains for the polyphasic comparison.

Molecular Identification, Genome Sequencing, and Pangenome Analysis
Whole genomic DNA was extracted using the FastDNA Spin Kit for Soil (MP Biomedicals, USA). For strain identification, the 16S rRNA gene was amplified and sequenced with the universal bacterial primers 27F, 518F, 785F, 805R, 907R, and 1492R [8][9][10]. Pairwise comparisons of the almost-complete 16S rRNA gene sequences of the three strains and those of related taxa were performed on the latest updated EzBioCloud (www.ezbiocloud.net) [11] and NCBI BLAST (www.ncbi.nlm.nih.gov). The draft genomes of the three strains were sequenced using the Illumina MiSeq platform (Macrogen Inc., South Korea) and annotated by Prokka v.1.12 and Rapid Annotation using Subsystems Technology (RAST). The metabolic pathways of strains RS43T, HBC28, and HBC61T were reconstructed using BlastKOALA [12]. The G+C content was calculated from the whole genome sequence. For comparative genome analysis, the genomes of Curvibacter species were obtained from NCBI (Table 1). Pangenomic analyses were performed using the EDGAR 3.0 platform [13]. The biosynthetic gene clusters of secondary metabolites were searched using antiSMASH 7 beta [14] with strict detection.
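The G+C calculation mentioned above can be illustrated with a minimal sketch. It assumes a draft assembly supplied as FASTA text and counts only unambiguous bases; the function names and the toy contigs are hypothetical and are not taken from the study, which does not state which tool was used for this step.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of contig sequences."""
    contigs, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line closes the previous contig, if any.
            if current:
                contigs.append("".join(current))
                current = []
        else:
            current.append(line)
    if current:
        contigs.append("".join(current))
    return contigs


def gc_content(contigs):
    """Return the G+C content (%) over all contigs, ignoring ambiguous bases."""
    gc = 0
    total = 0
    for seq in contigs:
        s = seq.upper()
        g_c = s.count("G") + s.count("C")
        a_t = s.count("A") + s.count("T")
        gc += g_c
        total += g_c + a_t
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / total


# Toy two-contig assembly (hypothetical data, for illustration only).
toy_fasta = ">contig1\nGGCC\n>contig2\nATGC\n"
toy_gc = gc_content(read_fasta(toy_fasta))
```

For a multi-contig draft genome, pooling base counts across contigs (as here) weights each contig by its length, which is the usual convention for a genome-wide value.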

Phylogeny
The 16S rRNA gene sequences of the three strains and other closely related strains obtained from the GenBank/EMBL/DDBJ databases were aligned using the MUSCLE algorithm. To identify the phylogenetic position of the three strains, phylogenetic trees were constructed using the maximum-likelihood [15], neighbor-joining [16], and minimum-evolution [17] methods with the software MEGA version 11 [18]. The confidence levels and distance matrices of the phylogenetic trees were determined with 1000 bootstrap replicates and Kimura's two-parameter model [19], respectively. To provide cogent evidence for the assignment of strains RS43T, HBC28, and HBC61T as novel species, the ANI and dDDH values were calculated using the OrthoANI algorithm and the Genome-to-Genome Distance Calculator (GGDC 2.0) [20], respectively. The two-way average amino acid identity (AAI) was calculated using the AAI calculator (http://enve-omics.ce.gatech.edu/aai/), and percentage of conserved proteins (POCP) values were calculated using the EDGAR 3.0 platform [13]. A phylogenomic tree was built using the Type (Strain) Genome Server (TYGS) [21].
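For readers unfamiliar with Kimura's two-parameter model, the sketch below computes the pairwise distance d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. It is an illustration only, not the MEGA implementation: the function name is hypothetical, and skipping any site with a gap or ambiguous base is an assumption made here for simplicity.

```python
import math

PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


# Toy alignment (hypothetical): one A->G transition in ten sites.
d_transition = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

The model separates transitions from transversions because the former typically occur more often; with no differences the distance is zero, and it grows faster than the raw mismatch proportion as sequences diverge.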

Chemotaxonomic Characterization
For chemotaxonomic analysis, all strains were grown on R2A for 3 days at 25°C. After harvesting the cells, the fatty acids were saponified, methylated, extracted, and analyzed according to the protocol recommended by the MIDI Microbial Identification System (version 6.1, MIDI) [24]. Polar lipids were extracted from lyophilized cells, separated, and identified as described previously [25][26][27][28]. Quinones were extracted and purified with a mixture of chloroform and methanol (2:1) for 3-4 h and analyzed by HPLC [29].

Phylogenetic Characteristics
The 16S rRNA gene sequences of strains RS43T, HBC28, and HBC61T determined by Sanger sequencing were 1474, 1484, and 1471 bp in length, respectively. Strains RS43T, HBC28, and HBC61T were most closely related to Curvibacter gracilis 7-1T, with 16S rRNA gene sequence similarities of 98.14%, 98.21%, and 98.76%, respectively. The 16S rRNA gene sequences of strains RS43T and HBC61T shared 99.35% similarity with each other. The phylogenetic trees revealed that the three strains formed a coherent cluster with C. gracilis 7-1T and Curvibacter lanceolatus IAM 14947T. This relationship was supported by high bootstrap percentages (Fig. 1). Therefore, the three strains were phylogenetically affiliated with the genus Curvibacter.

Genomic Characteristics
The draft genome sequences of strains RS43T, HBC28, and HBC61T had total lengths of 4,895,745 bp (24 contigs), 4,822,995 bp (21 contigs), and 4,848,818 bp (19 contigs), with sequencing depths of 148×, 147×, and 147×, respectively. Overall, the draft genomes met the criteria proposed for taxonomic purposes [30]. The G+C contents of strains RS43T and HBC28 were estimated to be 65.22% and 65.29%, respectively, which fall within the G+C content range (62.2-66.0 mol%) of members of the genus Curvibacter [1].
The genomic features of Curvibacter species are presented in Table 1. The draft genome sequences of Curvibacter species vary considerably in size, ranging from 3.78 Mbp for Curvibacter delicatus NBRC 14919T to 6.83 Mbp for Curvibacter lanceolatus ATCC 14669T. Additionally, the DNA G+C contents varied among the strains, with C. delicatus NBRC 14919T having the lowest content (63.5%) and Curvibacter cyanobacteriorum HBC61T showing the highest content (67.15%). The number of predicted genes also differed widely among Curvibacter species, ranging from 3678 to 6367.

Whole-Genome-Based Phylogeny
The AAI and POCP values for the three strains with the type species of the genus, C. gracilis, were 80.27-80.50% and 78.00-79.79%, respectively, which were higher than the genus boundaries of 60-80% for AAI [31] and 50% for POCP [32] (Figs. 2A-2B). These results supported that the three strains belong to the genus Curvibacter. Strains RS43T and HBC28 were found to belong to the same species, with an ANI value of 98.13% (Fig. 2C) and a dDDH value of 83.1% (Fig. 2D). However, the ANI and dDDH values between strains RS43T and HBC61T were 86.29% and 30.7%, respectively, implying that they represent two separate Curvibacter species. The dDDH and ANI scores between the three strains and their related type strains were much lower than the recommended species cut-off levels (dDDH, 70%; ANI, 95-96%), revealing their novel status within the genus Curvibacter [33,34]. The phylogenomic tree showed that the closest neighbors of the three strains were C. gracilis 7-1T and Curvibacter lanceolatus IAM 14947T (Fig. 3). Notably, Curvibacter delicatus formed a distinct clade separated from other members of the genus Curvibacter, indicating that this species may belong to another genus. Further studies are needed to identify the exact taxonomic position of this species. In summary, the whole-genome-based phylogeny clearly suggests that strains RS43T and HBC61T should be classified as two novel species within the genus Curvibacter.
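The species-level reasoning above can be expressed as a small decision rule. The sketch below applies the cut-offs quoted in the text (dDDH 70%; ANI 95-96%) to the two pairwise comparisons reported for the new isolates. The function name, the returned labels, and the handling of the 95-96% band as an "ambiguous" zone are illustrative assumptions rather than part of any published tool.

```python
DDDH_CUTOFF = 70.0     # species boundary for dDDH (%), as quoted in the text
ANI_LOWER = 95.0       # lower edge of the 95-96% ANI boundary
ANI_UPPER = 96.0       # upper edge of the 95-96% ANI boundary


def species_call(ani, dddh):
    """Classify a genome pair from its ANI (%) and dDDH (%) values."""
    if ani >= ANI_UPPER and dddh >= DDDH_CUTOFF:
        return "same species"
    if ani < ANI_LOWER and dddh < DDDH_CUTOFF:
        return "different species"
    # Assumption: discordant indices, or ANI inside the 95-96% band,
    # are flagged for closer inspection rather than forced into a call.
    return "ambiguous"


# Pairwise values reported in the text (Figs. 2C-2D).
pairs = {
    ("RS43T", "HBC28"): (98.13, 83.1),
    ("RS43T", "HBC61T"): (86.29, 30.7),
}
calls = {pair: species_call(ani, dddh) for pair, (ani, dddh) in pairs.items()}
```

Under these thresholds RS43T and HBC28 fall in one species while RS43T and HBC61T fall in two, matching the conclusion drawn in the text.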

Comparative Genomic Analysis
The distribution of specific genomic regions of Curvibacter species is shown in Fig. 4A. Functional analysis using the KEGG and COG databases revealed that these species displayed similar distribution patterns of subsystem categories (Figs. S1-S2). For instance, most genes were involved in vital central metabolic pathways, such as "amino acid transport and metabolism", "signal transduction mechanisms", "transcription", and "energy production and conversion". In total, the "core genome" consisted of 2034 genes (Fig. 4B). Of all genes, 41.6% were assigned to the dispensable category, whereas the percentage of singletons was estimated to be 36.9% (Fig. 4C). The lifestyle of a bacterial genus can be partly reflected by its pangenome [35]. Bacteria that inhabit a limited niche tend to have a closed pangenome, whereas bacteria living in a community generally possess an open pangenome with a high rate of horizontal gene transfer [35]. The pangenome is classified as open or closed based on the exponent gamma value of Heaps' law [36]. The pangenome size of Curvibacter increased while its core genome size decreased as more genomes were added (Figs. 4D-4E). The gamma value of Heaps' law was estimated to be 0.386, which is higher than 0, implying that the Curvibacter pangenome is open (Fig. 4E) [37].
The core genes of Curvibacter species were enriched in COG functions involved in "RNA processing and modification", "chromatin structure and dynamics", and "translation". In contrast, most singleton genes were related to "transposase", "defense mechanisms", and "intracellular trafficking" (Fig. 5A). To understand the ecological function of Curvibacter species, their potential for vitamin synthesis and biosynthesis of secondary metabolites was predicted. Bacteria produce several secondary metabolites to communicate with other microorganisms [38]. Strains RS43T, HBC28, and HBC61T possessed putative secondary metabolite gene clusters potentially related to the synthesis of terpene and beta-lactone (Fig. 5B). Terpenes, gene clusters for which were predicted in all Curvibacter species, play an essential role in the defense mechanisms of plants and fungi [39]. Beta-lactones have been reported to show antimicrobial activities and can be used as building blocks for complex compounds such as antibiotics and anticancer drugs [40]. Among Curvibacter species, C. gracilis ATCC BAA-807T harbored the highest number of secondary metabolite biosynthetic gene clusters. Its genome carries clusters predicted for terpene, beta-lactone, type 1 polyketide synthase (T1PKS), and nonribosomal peptide synthetase (NRPS) products. As NRPS was predicted in the genome of C. gracilis ATCC BAA-807T, this strain may produce antimicrobial, antiviral, anticancer, and anti-inflammatory compounds [41]. Bacteria can support algal growth by supplying vitamin B and iron [42]. Curvibacter species can promote the growth of the cyanobacterium Microcystis [43]. Given that Curvibacter species possess the biosynthetic pathways for cobalamin (vitamin B12) and biotin (vitamin B7) in their genomes (Fig. 5C), they have the potential to provide these vitamins to cyanobacteria, thereby facilitating the formation of cyanobacterial blooms.
Aromatic compounds, which constitute approximately 25% of the world's biomass, are the second most prevalent group of organic compounds in nature after carbohydrates [44]. They are considered among the most persistent pollutants in the environment [45]. The metabolic capacities of Curvibacter species predicted using BlastKOALA suggested that they could metabolize xenobiotic substances such as benzoate (C. gracilis ATCC BAA-807T and C. lanceolatus ATCC 14669T), phenylacetate (C. gracilis ATCC BAA-807T), catechol (all strains except C. delicatus NBRC 14919T), and salicylate (RS43T and HBC28) (Fig. 5C). Therefore, they have the potential to remove these compounds from water systems.
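To make the open/closed criterion concrete, the sketch below fits Heaps' law in the form n = κN^γ, where n is the pangenome size after N genomes, and reads the pangenome as open when γ > 0. This parameterization is assumed here because it matches the "higher than 0" criterion stated in the text. The fit uses ordinary least squares on log-transformed values, a simplification swapped in for illustration and not necessarily the fitting procedure used by EDGAR. The input series is synthetic: it is generated from a hypothetical κ = 4000 and the reported γ = 0.386, and is not the study's data.

```python
import math


def fit_heaps_law(genome_counts, pangenome_sizes):
    """Fit n = kappa * N**gamma by linear regression in log-log space."""
    if len(genome_counts) != len(pangenome_sizes) or len(genome_counts) < 2:
        raise ValueError("need at least two matched observations")
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(s) for s in pangenome_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("genome counts must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx                              # slope = Heaps exponent
    kappa = math.exp(mean_y - gamma * mean_x)      # intercept = scale factor
    return kappa, gamma


def is_open(gamma):
    """Open pangenome under the n = kappa * N**gamma convention."""
    return gamma > 0


# Synthetic pangenome growth curve (hypothetical kappa; gamma from the text).
counts = list(range(1, 9))
sizes = [4000.0 * n ** 0.386 for n in counts]
kappa_hat, gamma_hat = fit_heaps_law(counts, sizes)
```

A positive exponent means each added genome keeps contributing new genes, so the curve does not plateau; an exponent at or near zero would indicate a pangenome that has been sampled close to completion.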

Phenotypic Characteristics
The cells of strains RS43T, HBC28, and HBC61T were observed to be Gram-negative, rod-shaped with flagella (Fig. S3), and catalase- and oxidase-positive. The colonies appeared circular, smooth, convex, and colorless, with a diameter of 1-2 mm after 3 days of growth on R2A agar medium at 25°C. These strains grew well on R2A and NA media but not on LBA and TSA media. They were positive for hydrolysis of Tween 80, but negative for hydrolysis of Tween 20 and skim milk. Growth occurred within the pH range of 5.5 to 10. Several phenotypic features distinguish strain RS43T from strain HBC61T, confirming that they belong to two different species. For instance, strain HBC61T assimilated D-mannitol and gluconate, grew at 40°C, and tolerated NaCl concentrations of up to 1% (w/v), whereas strain RS43T did not. Table 2 provides further details on the additional phenotypic features distinguishing strains RS43T, HBC28, and HBC61T from their closely related strains.

Chemotaxonomic Characteristics
The fatty acid profiles of strains RS43T, HBC28, and HBC61T showed a pattern similar to those of the reference strains (Table 3). All strains had major fatty acids (> 10% of total fatty acids) that included summed feature 3 (C16:1 ω7c and/or C16:1 ω6c), summed feature 8 (C18:1 ω7c and/or C18:1 ω6c), and C16:0. Strains RS43T, HBC28, and HBC61T possessed Q-8 as the major quinone, which is consistent with the ubiquinone feature of the genus Curvibacter. The major polar lipid profile identified in strains RS43T and HBC61T consisted of phosphatidylethanolamine, diphosphatidylglycerol, and unidentified lipids (Fig. S4). However, strain HBC61T could be distinguished from strain RS43T by the presence of an additional unidentified phospholipid as a major component.

Table 2. Differential phenotypic characteristics of strains RS43T, HBC28, and HBC61T and their phylogenetic neighbors.

Cells are rod-shaped, 1.2-2.8 μm in length and 0.7-1.0 μm in width. Colonies grown on R2A agar are smooth, convex, and colorless with a 1-2 mm diameter. Motile with peritrichous flagella. Growth was observed at the temperature range of 10-37°C, up to 0.5% NaCl, and pH 5.5-10. Catalase and oxidase are positive. Able to hydrolyze Tween 80. Positive for esterase (C4), esterase lipase (C8), leucine arylamidase, acid phosphatase, and naphthol-AS-BI-phosphohydrolase. The major fatty acids were summed feature 3 (C16:1 ω7c and/or C16:1 ω6c), summed feature 8 (C18:1 ω7c and/or C18:1 ω6c), and C16:0. The quinone was Q-8. The major polar lipids were phosphatidylethanolamine, diphosphatidylglycerol, and two unidentified lipids.

Description of
The type strain HBC61T (=KCTC 92794T = LMG 32713T) was isolated from fresh water. The 16S rRNA gene sequence and the genomic sequence of strain HBC61T have been deposited under the GenBank/EMBL/DDBJ accession numbers OQ642154 and JAQSIP000000000, respectively.

Conclusions
Considering their chemotaxonomic, phylogenetic, and phenotypic characteristics, strains RS43T, HBC28, and HBC61T belong to the genus Curvibacter. Genome-relatedness indices and phenotypic and chemotaxonomic characteristics clearly differentiated them from all known Curvibacter species. Along with its phylogenetic distinctiveness, strain HBC61T was distinguished from RS43T and HBC28 by several phenotypic and chemotaxonomic features. Therefore, strains RS43T, HBC28, and HBC61T should be classified as two novel species of the genus Curvibacter, for which the names Curvibacter microcysteis sp. nov. (RS43T = KCTC 92793T = LMG 32714T) and Curvibacter cyanobacteriorum sp. nov. (HBC61T = KCTC 92794T = LMG 32713T) are proposed.

Fig. 1. Neighbor-joining phylogenetic tree based on the 16S rRNA gene sequences depicting the phylogenetic placements of the three strains among the related species. Bootstrap values above 50% are shown at branch points. Closed circles indicate that the corresponding nodes were also recovered with the maximum-likelihood and minimum-evolution methods, whereas open circles indicate nodes recovered with the neighbor-joining and maximum-likelihood algorithms.

Fig. 4. Comparative genomic analysis. Circular view of the genomes of Curvibacter species (A). The number (B) and percentage (C) of common core genes across all Curvibacter species. Core genome (D) and pangenome (E) profiles of Curvibacter species. The fitted exponential Heaps' law function is represented by the red curve, while the upper and lower boundaries of the 95% confidence interval are indicated by the green and blue curves, respectively.

Fig. 3. Phylogenomic tree inferred with FastME 2.1.6.1 from Genome BLAST Distance Phylogeny (GBDP) distances calculated from genome sequences. The branch lengths are scaled in terms of the GBDP distance formula d5. The GBDP pseudo-bootstrap support values from 100 replications exceeding 60% are displayed above the branches. The average support across branches is 94.6%.

Fig. 5. Functional genome analysis of Curvibacter species. The distribution of core, dispensable, and singleton genes (A). Gene clusters predicted in the genomes of Curvibacter species involved in secondary metabolite synthesis (B). Metabolism of cofactors and vitamins and xenobiotics biodegradation in the genomes of Curvibacter species (C).

Table 3. Fatty acid compositions (%) of strains RS43T, HBC28, and HBC61T and their phylogenetic neighbors.
Major fatty acids (> 10.0%) are highlighted in bold type. TR, trace amount (< 1%); ND, not detected. *Summed features refer to fatty acids that cannot be distinctly differentiated from another fatty acid; these fatty acids are grouped as one feature with a single percentage of the total. Summed feature 3 contained C16:1 ω6c and/or C16:1 ω7c. Summed feature 8 contained C18:1 ω7c and/or C18:1 ω6c.